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Abstract 


A  conceptual  schema  has  often  been  proposed  as  a  common  basis  for 
support  of  many  data  base  views.  We  outline  a  set  of  proposed  functions  and 
facilities  which  can  support  hierarchical,  network  and  relational  organizations 
of  data.  Common  integrity  constraints  for  the  data  base  can  be  enforced  at 
this  level  irrespective  of  user's  view. 


Table  of  Contents 


1.  Introduction  . 

2.  Basic  objects  . 

2.1.  Record  types  . 

2.2.  Selectors  . 

2.3.  Links  . 

2.4.  Expressions  . 

3.  Data  Definition  Language 

4.  Intention  Statements  ... 

5.  Generation  Statements  .. 

6 .  DDL  commands  . 

7.  Example  . 

8.  Concluding  remarks  . 


1 

2 

3 

4 

5 

7 

8 

11 

15 

17 

19 

21 


References 


1.  Introduction 


Data  Base  Management  Systems  (DBMS's)  can  be  divided  in  three  main 
categories  according  to  the  basic  data  organization  offered  to  their  users. 
The  three  approaches  are  hierarchical  [14]  ,  Network  [7]  and  Relational  [8, 

9,  10,  11,  12],  Each  one  of  the  views  is  important,  not  only  for  historitcal 
reasons,  but  because  it  can  provide  some  distinct  advantages  for  certain 
applications.  In  future  DBMS's,  all  views  may  have  to  be  offered  to  please 
a  diverse  population  of  users.  Acknowledging  this  requirement,  a  common 
facility  has  been  proposed  under  the  name  of  a  conceptual  schema  [16] .  The 
conceptual  schema  can  be  regarded  as  a  common  denominator  for  all  three  views 
It  is  not  unrealistic  to  consider  the  conceptual  schema  as  a  viable  DBMS 
approach.  All  the  DBMS's  serve  the  same  purpose  by  structuring  data  in  appro 
priate  ways.  The  mechanisms  used  for  their  implementation  are  also  very 
similar,  addition  all  DBMS's  have  to  run  ultimately  on  the  same  type  of 

hardware . 

There  is  always  a  temptation  to  isolate  one  particular  model  and  pro¬ 
pose  it  as  the  common  basis  for  every  view.  For  instance  the  hierarchical 
approach  has  some  distinct  advantages  in  efficiency  and  ease  of  implementa¬ 
tion.  The  DBTG  network  model  is  all  -  encompassing  and  it  offers  a  great 
number  of  options  for  extra  flexibility  and  potential  optimization.  Finally, 
the  relational  model  offers  a  very  clean  and  well  understood  interface,  on 
which  other  views  can  be  supported.  However,  the  conceptual  schema  differs 
slightly  in  terms  of  purpose  from  any  particular  approach  for  two  reasons. 
First,  a  conceptual  schema  does  not  have  to  have  many  end-user  oriented  fea¬ 
tures.  These  features  can  be  provided  by  particular  DBMS's  at  higher  levels. 
Second,  a  conceptual  schema  should  have  adequate  optimization  tools.  That 
is,  the  level  of  its  functions  should  be  low  enough  to  permit  some  control 
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over  access  paths  and  algorithms.  There  is  no  inherent  reason  for  the  con¬ 
ceptual  schema  to  be  exactly  hierarchical,  network,  or  relational. 

We  will  propose  a  conceptual  schema  which  is  based  on  networks,  but 
does  not  exactly  follow  all  the  rules  of  DBTG  [7].  Neither  does  it  offer 
all  the  options  and  facilities  of  DBTG.  As  a  matter  of  fact,  a  conceptual 
schema  forces  a  separation  of  DBTG  functions  into  two  categories:  end-user 
oriented  and  system  oriented.  The  system  oriented  facilities  are  retained  in 
the  conceptual  schema,  while  the  end-user  oriented  facilities  are  at  a  higher 
level  of  the  system. 

The  main  properties  which  we  will  seek  in  a  proposed  conceptual 
schema  can  be  summarized  as : 

1)  Enough  flexibility  to  support  different  views  of  data. 

2)  Adequate  structuring  and  optimization  tools. 

3)  A  framework  for  common  integrity  constraints  for  the  data 
base . 

4)  Absence  of  all  purely  physical  properties  of  data. 

We  hope  that  we  will  come  reasonably  close  to  satisfying  some  of  these 
requirements  with  our  proposed  features  for  a  conceptual  schema. 


2 .  Basic  Ob-jects 

We  propose  four  basic  objects  for  the  definition  of  structure  among 
data.  All  of  them  appear  in  different  ways  in  all  three  approaches  of  DBMS 
organization.  They  also  appear  in  many  models  of  semantic  properties  of  data. 
The  four  basic  objects  correspond  to  four  elementary  facilities  that  any 


DBMS  should  have. 
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1)  The  ability  to  define  data  pools  where  data  is  stored. 

2)  The  ability  to  select  appropriate  data  in  each  data  pool. 

3)  The  ability  to  connect  different  data  pools. 

4)  The  ability  to  combine  the  previous  facilities. 


2.1.  Record  types 

For  the  data  pool  facility  we  adopt  the  record  type  concept  in 
the  same  form  as  it  exists  in  DBTG.  A  record  type  is  a  named  collection  of 
records  of  similar  type.  Each  record  type  consists  of  a  number  of  data  item 
types.  The  data  item  types  represent  data  attributes  which  can  take  values 
over  a  domain  of  similar  type  values.  Semantically  record  types  usually  stand 
for  entities  of  the  real  world  and  the  data  items  stand  for  their  attributes 
or  characteristics. 

There  have  been  proposals  to  consider  attributes  or  characteristics 
as  independent  basic  objects  related  through  binary  relations  [1,  18] .  Even¬ 
tually  though,  some  grouping  of  these  characteristics  has  to  take  place  in 
order  to  mirror  certain  semantic  properties  [17],  We  favour,  therefore, 
record  types  as  they  represent  objects  and  their  grouped  attributes.  We  will 
not  elaborate  on  the  method  of  grouping  the  characteristics  in  record  types. 

Both  semantic  properties  and  functional  properties  are  important  in  choosing 
the  record  types.  Third  normal  form  should  be  a  major  consideration  [9,  11, 

4] .  Another  very  important  reason  to  choose  record  types  as  basic  objects  is 
that  they  appear  somehow  in  every  DBMS  organization. 

In  hierarchical  systems  they  appear  as  segment  types,  in  network  systems 
as  record  types  and  in  relational  systems  they  correspond  approximately  to 


primary  relations  [8] . 


-  A  - 


We  will  not  insist  on  any  particular  implementation  for  the  record 
types.  In  most  systems  they  are  implemented  as  straight  files.  However, 

DBTG  does  not  preclude  other  methods  for  record  type  implementation.  For 
instance,  for  certain  environments  a  binary  relation  implementation  [6]  or 
a  transposed  file  implementation  can  be  very  appropriate. 

The  record  types  will  usually  be  denoted  by  capital  letters  X,  Y,  Z, 
or  by  names  like  EMPLOYEE,  PARTj DEPT  corresponding  to  entities.  The  number 
of  records  which  can  be  associated  with  a  record  type  is  theoretically  un¬ 
bounded.  Each  record  has  the  same  type,  but  not  necessarily  the  same  size. 

The  order  of  records  is  important.  Copies  of  the  same  record  are  allowed  and 
are  treated  independently.  The  last  two  assumptions  clearly  separate  record 
types  from  conceptual  relations  [8] . 

The  data  items  within  the  record  types  will  be  denoted  by  positional 
notation  Yj^,  or  by  names  related  to  their  meaning,  like  SALARY,  ADDRESS, 
etc.  If  data  item  names  are  not  unique,  they  have  to  be  qualified  by  their 
corresponding  record  type  like  EMPg^, 

2 . 2  Selectors 

Selectors  provide  the  ability  to  qualify  and  isolate  a  set  of  records 
of  the  same  type.  The  selection  criterion  can  be  very  general.  It  can  be 
based  on  data  item  values,  or  order  within  the  type,  or  even  properties  not 
expressible  in  closed  form.  The  selection  according  to  data  item  values  is 
similar  to  qualification  of  tuples  in  relational  systems  [5] .  However,  copies 
are  obtained  unless  otherwise  specified.  One  can  also  select  according  to 
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order,  e.g.,  the  first,  last,  next,  previous,  or  record.  In  addition 

one  can  browse  through  the  record  type  and  select  appropriate  records 
(manual)  selection. 

Selectors  are  named  basic  objects  which  represent  the  subset  of 
records  selected.  On  each  record  type  we  can  define  many  selectors.  As 
we  will  see  later  on,  we  can  then  either  save  the  definitions,  or  even 
retain  an  appropriate  structure  giving  faster  access  to  selected  records. 

Selectors  will  be  denoted  either  by  the  capital  letter  S  and  a  sub¬ 
script  for  the  corresponding  record  type  S^j  or  by  specific  names  related  to 
their  meaning  like  SICK  EMPLOYEES,  FATCATS ,  etc. 

All  DBMS  approaches  need  selection  in  various  degrees.  In  some 
systems  selection  is  very  primitive,  handling  one  record  at  a  time.  We  feel 
however,  that  the  underlying  system  should  have  the  capability  for  a  very 
flexible  selection  definition.  This  should  be  the  case  even  if  the  user  is 
forced  to  retrieve  one  record  at  a  time  due  to  host  language  constraints. 

2.3.  Links 

The  ability  to  define  record  types  and  select  records  is  present  even 
in  file  systems.  What  distinguishes  the  DBMS  approach  is  that  there  is  also 
an  ability  to  connect  and  correlate  the  data  present  in  different  record  types. 
In  hierarchical  systems  this  ability  exists  through  the  hierarchical  structure 
and  the  data  base  trees.  In  network  systems  it  exists  through  the  facility 
of  DBTG  sets.  Finally,  in  relational  systems  it  exists  through  the  facility 
of  derived  relations. 


Consider  two  record  types  X  and  Y.  A  link  between  the  record  types. 
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is  a  mapping  between  individual  records  of  the  record  types.  That  is, 

each  record  of  type  X  can  be  associated  with  some  record  of  type  Y  and  vice 

versa.  We  impose  no  rules  for  the  relationships  expressed  by  links.  They 

can  be  1:1,  1:N,  or  N:M  relationships.  Each  link  is  a  bidirectional  mapping. 

We  can  go  from  X  to  Y,  or  from  Y  to  X.  However,  when  we  write  L  ,  we  will 

y  X 

imply  going  in  one  direction,  from  X  to  Y,  similar  to  the  mapping  facilities 

of  SQUARE  [5j .  Links  will  be  denoted  by  the  notation  L  ,  or  by  names  like 

y  X 

FATHER  OF,  MARRIED  TO,  etc. 


Between  two  record  types  we  can  define  many  different  links,  each 
having  a  separate  name.  This  situation  exists  also  in  real  life.  For  instance, 
between  two  persons  there  can  be  connections  of  kinship,  management,  debt, 
marriage,  etc.  Note  that  a  link  is  allowed  to  connect  records  of  the  same 
record  type.  Such  a  link  L  is  not  only  permitted,  but  it  is  also  often 
essential.  DBTG  does  not  permit  such  sets,  but  the  situation  may  change. 

There  is  an  increasing  awareness  of  the  importance  and  flexibility  of  such  a 
link. 


One  reason  for  the  presence  of  links  is  to  encode  extra  information 
corresponding  to  the  relationship  they  implement.  Such  links  are  called  infor¬ 
mation  carrying  links.  Links  can  also  be  inferred  according  to  the  values 
of  certain  data  items  of  the  record.  Such  links  are  only  present  to  provide 
a  faster  access  path.  They  are  called  non-information  carrying  links.  For 
example,  the  connection  in  a  hierarchical  system  between  a  father  segment 
type  and  a  son  segment  type  is  often  an  information  carrying  link.  The  connec¬ 
tion  between  two  relations  expressed  by  the  join  of  the  two  relations  is  a 
non-information  carrying  link. 


Links  can  be  constructed  according  to  well  defined  properties  involving 
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values  of  data  items  and  can  be  expressed  in  closed  form.  Such  links  are 
always  non-information  carrying.  In  addition,  links  can  be  constructed  manually 
be  indicating  that  certain  records  are  related.  Such  links  can  also  be  infor¬ 
mation  carrying. 

2.4  Expressions 

We  now  have  the  ability  to  isolate  records  through  selectors  and  to 
connect,  or  relate  them  through  links  to  other  records  of  a  different  type. 

We  also  need  to  compose  these  two  facilities  in  various  ways  to  navigate  in 
the  data  base  along  the  connections  provided  by  the  links.  Notice  that  our 
navigation  will  be  slightly  different  than  DBTG  navigation  [2] .  We  will 
navigate  handling  sets  of  records  rather  than  a  record  at  a  time. 

We  define  expressions  of  links  and  selectors  in  the  obvious  way  [19] 


1) 

E  is 

s  , 

or  E  a  E 

X  X 

X 

X  y  y  X 

2) 

E  is 

L 

,  or  E  «  E 

y  X 

y  X 

y  z  z  X 

3)  Boolean  closure  of  expressions  with  similar  left  side.  This  gives 

us  the  ability  to  have  expressions  E  with  multiple  sources. 

z  X  ,y 

One  can  also  define  expressions  with  multiple  destinations,  e.g. 

E  .That  means  that  we  start  with  records  X,Y  and  in  some  meaningful 
way  we  get  records  W,  Z.  We  prefer  though  to  leave  aside  such  complications. 

Expressions  of  links  and  selectors  look  syntactically  similar  to 
SQUARE  expressions,  but  not  identical.  In  a  SQUARE  expression,  each  term 
signifies  a  mapping  inside  a  relation  from  column  to  column.  The  composition 
operator  introduces  the  values  obtained  from  one  relation  to  another  relation. 
In  our  expressions  each  term  is  a  selector,  or  link  connecting  two  different 
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record  types.  The  composition  relates  not  sets  of  values  but  sets  of  records. 


An  expression  corresponds  semantically  to  the  set  of  records  obtained 
by  navigating  through  the  record  types  in  a  right  to  left  direction  in  the 
expression.  In  addition,  we  perform  set  operations  as  denoted  by  the  expression. 
For  instance,  the  expression 


E  =  L  L  •  S 
yx^z  yzyx  x 

represents  the  navigation:  "Select  X  records  according  to  S  ^  then  obtain  the 


Y  records  connected  to  the  set  by  L  .  Intersect  this  new  set  of  Y  records 

y  X 

with  the  records  Y  which  are  connected  to  existing  records  Z  by  the  link  L  . 

y  z 

The  net  result  is  a  set  of  records  Y". 


Each  expression  is  applied  by  default  to  all  records  of  a  particular 
type  unless  restricted  explicitly  by  selectors,  or  links.  We  can  also  apply  the 
expression  to  a  subset  of  records  as  given  by  another  expression,  or  to  a  indi¬ 
vidual  record.  We  can  also  project  a  set  of  records  represented  by  an  expression 
to  an  individual  data  item  to  obtain  a  set  of  values.  These  values  can  be  further 
used  in  the  definition  of  new  selectors.  All  these  facilities  give  us  a  complete 
capability  to  manipulate  both  records  and  their  data  item  values. 


The  notation  of  expressions  is  sim.ilar  to  the  operations  of  the  relational 
algebra  [8].  However,  the  links  and  selectors  use,  in  their  definition,  cal- 
culus-like  constructions.  In  addition,  the  connections  represented  by  links  can 
be  much  more  complicated  than  the  usual  derived  relational  operations  like 
join,  division,  etc. 

3 .  Data  Definition  Language 

The  Data  Definition  language  of  the  conceptual  schema  has  to  provide 
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constructs  for  the  definition  of  record  types,  selectors,  links  and  expres¬ 
sions.  In  addition,  it  must  have  facilities  for  specifying  integrity  con¬ 
straints  on  the  data  base.  These  constraints  should  not  only  be  interrecord 
but  intrarecord.  We  should  have  facilities  to  restrict  domains'  values, 
specify  keys,  etc.  In  addition,  we  need  to  specify  certain  constraints  along 
access  paths  as  represented  by  links  and  expressions.  A  good  example  of  such 
a  constraint  is  the  unique  owner  rule  of  the  DBTG  proposal. 

We  distinguish  the  statements  involving  the  basic  objects  in  two  cate¬ 
gories;  Intention  statements  and  generation  statements.  Although  their  syntac¬ 
tic  form  is  similar,  their  function  is  completely  different. 

An  intention  statement  specifies  the  constraints  by  which  a  particular 
object  should  abide.  An  intention  statement  may  specify,  for  instance,  that 
a  particular  data  item  is  a  key  of  a  record  type.  It  may  specify  that  a  link 
corresponds  to  a  1:N  relationship  and  not  a  N :M  relationship.  An  intention 
statement  may  constrain  a  selector  to  select  no  more  than  ten  records,  or  not 
to  select  records  with  certain  null  data  items.  An  intention  statement  may 
specify  that  the  records  represented  by  an  expression  have  some  desired  proper¬ 
ties,  which  we  know  semantically  are  true. 

Depending  on  the  time  of  its  application,  an  intention  statement  is  a 
constraint,  or  a  hypothesis,  or  both.  It  specifies  a  particular  intention  which 
may  be  true  or  not  depending  on  the  data  present.  If  it  is  true,  we  may  ask 
that  it  remain  true  in  the  future.  This  means  that  future  modifications  of  the 
data  base  will  have  to  abide  by  the  specified  restrictions. 

A  generation  statement  specifies  some  properties  according  to  which  a 
basic  object  can  be  automatically  generated.  In  the  case  of  record  types, 
a  generation  statement  will  essentially  correspond  to  a  loading  facility  for 
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the  record  type,  or  automatic  generation  according  to  pre-defined  procedures. 

In  the  case  of  links,  it  would  correspond  to  a  property  by  which  the  link  can  be 
generated.  The  perfect  example  is  a  join  in  a  relational  system.  For  selectors, 
the  generation  statement  will  correspond  to  the  Boolean  qualification  of  the 
selector.  In  an  expression,  it  will  correspond  to  the  composite  statement  of 
the  generation  statements  of  the  links  and  selectors  of  the  expression.  One 
could  give  an  independent  generation  statement  for  an  expression,  but  it  is 
better  to  forfeit  such  generality. 

Each  generation  statement  can  define  a  new  path,  which  makes  sense 
semantically.  It  can  also  be  used  to  define  a  fast  access  path,  which  can  be 
possibly  maintained  by  the  system.  It  can  also  serve  as  a  reminder  of  a  com¬ 
plicated  data  connection.  Note  that  in  practice  we  may  not  need  a  very  general 
form  of  generation  statement.  Relational  systems  can  do  very  well  with  rather 
simple  joins. 

Each  basic  object,  i.e.,  record  type,  selector,  link  or  expression, 
may  have  none,  one  or  more  intention  statements.  Each  basic  object  may  or  may 
not  have  a  generation  statement.  We  can  thus  give  a  precise  definition  of  an 
essential,  information  carrying  link.  A  link  is  information  carrying  if  it 
does  not  have  a  generation  statement.  Note  that  we  could  have  said  that  "there 
does  not  exist  a  generation  statement".  The  second  definition  would  be  very 
complicated  to  check. 

We  should  also  distinguish  between  the  definition  and  the  application 
of  the  statement.  Definition  means  stating  the  nature  of  the  statement,  giv¬ 
ing  the  name  and  checking  it  syntactically.  Some  semantic  checking  may  also 
be  appropriate.  However,  the  statement  is  not  enforced  on  the  data.  Neither 
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does  it  result  in  the  generation  of  any  additional  data  base  access  paths. 

The  application  of  an  already  defined  statement  implies  the  execution  of  the 
statement  on  the  data  base.  In  the  case  of  an  Intention  statement,  its 
application  will  control  the  data  base  modifications  later  on.  In  the  case  of 
a  generation  statement,  its  application  will  result  in  a  structure  of  pointers 
which  implement  the  generated  link,  selector,  etc.  The  system  may  he  respon¬ 
sible  for  maintenance  of  the  structure.  The  definition  of  intention  and, 
generation  statements  is  always  welcome.  It  serves  as  a  reminder  of  the  inten¬ 
tion  and  semantics  of  the  data  base.  Their  application,  however,  involves 
questions  of  efficiency  and  should  be  very  carefully  controlled. 

4 .  Intention  Statements 

Consider  for  a  moment  selectors,  links  and  expressions  in  the  same 
light  as  derived  relations  [19].  It  seems  that  we  have,  in  any  relational 
language,  a  full  expression  capability  for  intention  and  generation  statements. 
Unfortunately,  there  are  some  problems.  First,  we  need  to  consider  copies  of 
records  separately.  Consider  for  instance,  two  son  segments  in  a  hierarchical 
system,  each  one  having  a  different  father.  There  are  many  cases  where  the 
contents  of  the  two  sons  are  identical  in  terms  of  data  item  values  as  in 
Figure  1.  For  instance,  x  can 


be  a  job  description  which  is  the  same  for  two  employees.  If  we  look  at  the 
link  connecting  sons  and  fathers  as  a  relation,  we  cannot  distinguish  between 
la  and  lb.  However,  there  are  obvious  differences  between  Figure  la,  lb  in 
terms  of  updates.  In  addition,  the  relationship  expressed  by  lb  is  not  allowed 
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in  a  hierarchical  system. 

We  also  need  to  identify  the  number  of  records  satisfying  a  property. 
This  facility  should  be  in  addition  to  the  existential  and  universal  quanti¬ 
fiers.  For  instance,  in  a  hierarchical  system  a  son  record  should  have  exactly 
one  unique  father  record.  Hence,  in  the  link  relating  rather  and  son  records, 
we  need  to  express  that  for  every  X  record,  there  is  one  unique  Y  record. 

We  will  outline  a  language  for  intention  statements.  We  will  use  free 
variables  for  both  records  and  data  items.  Whenever  there  is  no  ambiguity  we 
will  use  the  names  of  the  records,  or  data  items,  themselves  to  stand  for  the 
corresponding  free  variables.  For  instance,  EMPLOYEE  may  stand  for  the  free 
variable  taking  values  over  the  record  type  EMPLOYEE.  EMPLOYEE  will  be  the 

u  A-J-j 

salary  part  of  the  free  variable  EMPLOYEE.  However,  SAL  will  stand  for  the 
free  variable  taking  values  of  the  data  item  salary.  We  use  this  convention 
only  for  mnemonic  purposes.  Whenever  there  is  a  possibility  for  confusion  we 
will  explicitly  define  free  variables  x,  y,  z,  etc. 

Each  intention  statement  will  have  the  form: 

<label>  intention  in  <object  name>  where  <Free  variable  list>:  <Qualified 
formula^ where : 

<label>  is  a  label  to  identify  the  statement 
-  <object  name>  is  the  name  of  the  record  type,  selector,  link  or  expression 
fob  which  the  intention  is  specified 

<Free  variable  list>  is  a  list  of  free  variables  x,  y,  z,  ...  and  data  item 
names,  or  record  names  that  they  stand  for.  For  example,  "x  for  EMPLOYEE 
and  p  for  SALARY"  is  a  free  variable  list. 

<Quantified  formula>  is  an  expression  of  variable  names,  comparators 


13 


variable  values  and  quantifiers.  The  formula  expresses  a  predicate. 

The  quantifiers  are  the  usual  (V  x) ,  (Jx).  We  also  allow  a  number  of 
other  operators  which,  operating  on  a  set  of  values,  result  in  one 
value.  For  example,  (lx),  (max  x)  ,  (//x)  the  last  standing  for 

number  of  x  items  satisfying  a  given  property.  Boolean  connectives 
are  used  in  the  usual  manner  for  set  operations.  Needless  to  say,  the 
formula  has  to  be  well  formed.  One  cannot  take  the  sum  of  x  when  x 
corresponds  to  a  record  name.  Neither  can  one  compare  variables  or 
expressions  which  are  not  compatible.  All  free  variables  in  the 
<quantified  formula>  have  to  appear  in  the  free  variable  list.  All 
free  variables  have  to  be  bound  by  operators  or  quantifiers.  Any 
record  type  or  data  item  name  appearing  as  a  free  variable  is  part  of 
the  <object  name>  given. 

There  are  cases  where  we  want  to  disregard  copies  of  the  same  record. 
We  will  denote  this  in  our  quantifiers  and  operators  by  putting  the  symbol  (~) 

rsj 

over  them.  For  instance,  (lx)  ,  (//x)  will  stand  for  the  equivalent  operation 
disregarding  copies.  However,  in  normal  use,  copies  will  always  count. 

We  did  not  define  a  completely  formal  language  for  intention  state¬ 
ments.  We  try  to  outline  the  flavour  of  the  language,  not  the  specific  con¬ 
structs.  We  now  give  some  examples  of  statements  and  their  intended  meaning. 

1)  SALARYCEILING  intention  in  EMPLOYEE:  (#SAL)  ISAL  >10,000]  =  0.  The 
statement  specifies  that  no  salary  data  item  of  EMPLOYEE  should 
have  contents  greater  than  10,000.  The  same  property  can  be 


SAL 


expressed  with: 

SALARYCEILING  intention  in  EMPLOYEE  :  (//EMPLOYEE)  [EMPLOYEE 


>10,000]  =  0 
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2)  SALARYDEDUCTIONS  intention  ^  EMPLOYEE: 

0/EMPLOYEE)  [EMPLOYEE^ >EMPLOYEE^^^] 

SAL  DED 

The  statement  specifies  that  every  employee  should  have  salary 
greater  than  his  deductions. 

3)  EMPNOKEY  intention  in  EMPLOYEE: 

(\/EMPNO)  [(#EMPLOYEE)  [EMPLOYEE^,^,,^  =  EMPNO]  <1] 

EMPNO 

This  statement  specifies  that  EMPNO  is  a  key.  The  name  EMPNO  is 
used  once  as  a  free  variable  and  once  as  a  subscript  denoting  a  data  item  in 
EMPLOYEE.  The  statement  could  be  better  described  using  free  variables. 

EMPNOKEY  Intention  in  EMPLOYEE  where  x  for  EMPLOYEE  and  p  for  EMPNO : 

(Vp)  [(#)  =  p]sll 

The  same  property,  i.e.,  declaring  a  key,  can  be  described  using  a 
different  statement. 

EMPNOKEY  intention  in  EMPLOYEE  where  x  for  EMPLOYEE  and  y  for  EMPLOYEE : 

d4)  -  (3y)  J 

Note  that  the  two  statements  express  the  same  property,  but  they 
denote  two  different  algorithms  to  check  it.  In  the  first  case  we  try  to 
verify  that  for  every  EMPNO  possible  there  is  at  most  one  record  with  that  EMPNO. 
In  the  second,  we  verify  that  for  every  record  there  is  no  other  record  with 
the  same  EMPNO  unless  it  is  identical. 

A  third  statement  expressing  a  slightly  different  requirement  is: 

EMPNOKEY  intention  in  EMPLOYEE  where  x  fo£  EMPLOYEE  a^  y  for  EMPLOYEE 
(V'x)  -  (3y)  [Xgj^^Q  -  y 

The  statement  specifies  that  no  two  EMPLOYEE  records  can  have  the 
same  EMPNO.  It  is  equivalent  in  terms  of  intention  to: 


15 


EMPNOKEY  Intention  In  EMPLOYEE  where  x  for  EMPLOYEE  and  p  for  EMPNO: 

<Vp)  W/x)  •  p]  <1] 

4)  Consider  the  case  of  a  mandatory  DBTG  set  called  MANAGEMENT.  The  set 

has  MANAGER  as  owner  record  type  and  EMPLOYEE  as  member  record  type.  The 
property  of  a  unique  owner  can  be  easily  expressed  by: 

DBTGSET  intention  in  MANAGEMENT  where  x  for  MANAGER  and  y  for  EMPLOYEE: 

(V'y)  [(#x)  =1] 

Note  that  we  do  not  have  to  express  any  property  between  x,  y.  The 
"in  MANAGEMENT"  part  of  the  statement  expresses  the  fact  that  x  and  y  have  to 
be  in  the  relationship  MANAGEMENT. 

The  case  of  expressions  is  more  complicated  because  we  need  many  more 
free  variables.  As  a  matter  of  fact,  in  a  more  practical  environment,  it  may 
be  better  to  avoid  specifying  explicit  intention  statements  for  expressions. 
Their  properties  can  be  implied  by  the  links  and  selectors  involved. 

5 .  Generation  Statements 

Generating  instructions  specify  the  method  by  which  a  particular  object  will 
be  generated.  A  generating  instruction  always  follows  the  definition  of  an 
object.  There  can  be  only  one  generation  statement  per  object  and  it  has 
the  form: 

-  Generation  from  <generating  clause> 

or  we  specify  the  generating  instruction 

-  Generation  manual 

The  generating  clause  differs  depending  on  the  object.  In  the  case 
of  a  record  type,  it  is  a  file  name  from  which  the  record  type  will  be  loaded. 
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In  the  case  of  a  selector,  it  is  a  general  boolean  qualification  with  data 
item  names  and  values  which  can  be  used  to  select  appropriate  records. 


In  the  case  of  a  link,  the  generating  clause  is  a  predicate  which 
relates  data  item  values  of  the  two  different  record  types  involved  in  the  link. 
In  case  of  an  link,  we  need  to  use  primed  data  item  names  to  distinguish 
the  two  instances  of  the  same  record  type  being  linked.  A  very  general 
predicate  can  conceptually  be  used  to  link  two  record  types.  However,  the 
generation  and  maintenance  of  such  a  link  may  be  very  hard  [3,  19].  It  is, 
therefore,  better  to  limit  the  scope  of  link  generation.  We  can,  for  instance, 
use  a  set  of  equations  of  the  form  y  =  f(x)  connecting  explicitly  data  item 
names  of  record  type  Y  with  data  item  names  of  record  type  X.  In  a  practical 
environment  it  may  be  sufficient  to  allow  only  equality  between  data  item  names 
and  similar  easy  associations.  In  the  case  of  an  expression,  the  generating 
clause  is  the  expression  definition  using  composition  of  links  and  selectors. 


The  manual  generation  implies  that  the  object  will  be  generated  expli¬ 
citly  with  manual  operations.  Concurrency  indicators  are,  therefore,  associated 
with  the  object  for  generation  purposes.  We  need  one  indicator  for  a  record 
type,  one  for  a  selector,  two  for  a  link  and  two  or  more  for  an  expression 
(manual  generation  of  an  expression  can  be  very  messy) .  The  manual  generation 
of  an  object  is  performed  by  pointing  the  currency  indicators  appropriately 
and  issuing  explicit  commands.  For  instance,  in  the  case  of  a  link^L^  we  slide 
the  X  and  Y  currency  indicators  accordingly  and  then  connect  the  two  records 
explicitly. 


We  could  allow  very  general  generation  statements  by  allowing  free 
variables  to  be  used  in  the  generating  clause.  However,  we  do  not  see  immediately 
the  need  for  this  facility.  In  addition  it  would  be  very  time-consuming  to 
create  the  generated  objects. 
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6 .  DDL  commands 

The  commands  for  the  definition  of  objects  are: 

-  For  a  record  type : 

Definition  record  <name> 

data  item  <name>  <type> 

data  item  <name>  <type> 

Followed  by  one  generating  instruction.  For  instance: 

Generation  manual 

Followed  optionally  by  some  (zero,  one  or  more)  intention  statements. 

-  For  a  selector: 

Definition  selector  <name>  1^  <record  name> 

Followed  by  one  generating  instruction 
Followed  optionally  by  some  intention  statements. 

-  For  a  link: 

Definition  link  <name>  from  <record  name>  <record  name> 

Followed  by  one  generating  instruction 
Followed  optionally  by  some  intention  statements 

-  For  an  expression: 

Definition  expression  <name>  from  <record  name  list>  <record  name> 
Followed  optionally  by  some  intention  statements. 

The  definition  will  only  enrich  the  schema,  but  it  will  not  have  any 
real  effect  on  the  data  base.  In  order  to  apply  the  definitions  on  the  data 
base  we  need  the  following  statements. 
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constrain  <name>  by  label  of  intention  statement> 

release  <name>  from  <Label  of  intention  statement> 

The  statements  will  trigger  the  application,  oir  release  of  the 
pointed  intention  statement  on  the  named  basic  object.  The  intention  state¬ 
ment  should  obviously  be  defined  on  the  corresponding  basic  object. 

In  order  to  create  a  basic  object  as  opposed  to  defining  it,  we  need 

a  series  of  statements  of  the  form: 

create  <»ame> 
destroy  <name> 

The  specified  object  will  be  created  according  to  its  generation 
statement.  In  the  case  of  manual  generation  the  currency  indicators  will  be 
set  up  for  manual  creation. 

It  would  be  nice  to  distinguish  in  some  cases  between  four  kinds  of 
create  commands : 

a)  create  and  maintain.  This  is  especially  crucial  in  the  case  of  an  automatic 
generation  according  to  well  defined  properties.  In  this  case  we  may  want 
the  system  to  maintain  the  created  object  according  to  its  definition  as 
the  data  base  evolves. 

b)  create  and  freeze  record  types.  We  ask  the  system  to  forbid  temporarily 
any  changes  on  all  record  types  affected  by  the  created  basic  object. 

c)  create  and  freeze  record  occurrences.  We  ask  the  system  to  lock  only  the 
record  occurrences  affected.  For  Instance,  in  the  case  of  a  selector  all 
selected  records  can  only  be  modified  compatible  to  the  selector's  defi¬ 
nition.  However,  any  newly  arrived  records  will  not  be  selected. 

d)  create  and  copy.  This  facility  is  especially  appropriate  in  the  case  of 


complicated  selectors,  or  expressions,  whose  results  we  want  to  copy 
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separately.  This  gives  an  independent  snapshot  of  the  data  base. 

Notice  that  (b) ,  (c)  and  (d)  have  very  different  results  as  far  as 
the  data  base  is  concerned  but  present  a  similar  view  to  the  real  world. 

7 .  Example 

We  will  outline,  as  an  example  some  of  the  constraints  we  need  to 
impose  on  a  set  of  records  in  order  to  create  a  hierarchical  data  base. 

Consider  a  set  of  record  types  X,  Y,  strongly  connected  by  links. 

The  links  have  to  obey  the  following  restrictions. 

1)  No  L  is  allowed 

XX 

2)  Between  any  X  and  Y  there  is  at  most  one  link  defined. 

3)  There  is  no  cycle  in  the  links.  That  is,  we  cannot  have  a  set  of  links 

restriction  is  expressed  very  nicely  using  expressions. 
No  expression  of  the  form  E  can  be  obtained  from  the  links.  Notice  the 

X  X 

correspondence  between  (1)  and  (3) . 

4)  No  alternate  paths  are  allowed.  For  instance,  we  cannot  have  links  allow¬ 
ing  the  formation  of  two  alternate  paths  between  record  types  X  and  Y.  The 
requirement  can  be  expressed  again  very  nicely  using  expressions.  Any  two 

expressions  E  ,  E  between  X  and  Z  have  the  same  form  in  terms  of  links, 
z  X  z  X 

That  is,  they  are  isomorphic,  if  we  disregard  the  closure  operators  and  the 
selectors.  Notice  the  correspondence  between  (2)  and  (4). 

5)  Every  link  has  at  least  the  intention  statement: 

LABEL  intention  in  L  :  ( (/x)  [  (#y)  =  1] 

- y  X 

That  is,  all  links  are  completely  functional  in  one  direction. 
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6)  There  is  a  unique  record  type  Z  (called  root)  such  that,  for  all  record 
types  X  connected  to  Z  by  a  link  we  have  the  following  intention  statement. 

LABEL  intention  in  L  :  (Vx)  [ (#Z)  =  1) 

z  X 

That  means  that  every  link  between  the  root  and  another  type  has  to  be 
completely  functional  in  the  direction  going  to  the  root. 

The  uniqueness  of  the  root  and  properties  (1)  to  (5)  force  a  complete 
tree  structure  among  the  record  types.  We  encourage  the  reader  to  prove  it 
to  his  satisfaction. 

In  addition,  we  need  the  existence  of  certain  keys  and  sequence  fields 

7)  There  is  a  key  data  item  KEY  for  the  root  type.  This  requirement  for  the 
root  record  type  Z  is  expressed  with  the  intention  statement: 

LABEL  intention  in  Z:  (\/kEY)  [(#Z)  [Z_^  =KEY]  <  1]] 

KEY 

8)  There  is  a  sequence  field  SF  for  every  other  record  type  (key  within  the 
father) .  This  requirement  is  expressed  as  an  intention  for  the  link  L 

y  X 

between  a  father  record  type  Y  and  its  son  record  type  X. 

LABEL  intention  in  L  :  (^Y)  (VSF)  [  (#X)  =  SF]  <  1] 

-  y  X  or 

Other  intentions  may  be  specified  for  the  hierarchical  data  base.  How 
ever,  the  restrictions  (1)  to  (8)  are  necessary  to  give  to  the  user  an  IMS- 
like  hierarchical  structure.  Notice,  that  other  users  may  have  another  view 
of  this  data  base.  However,  they  are  not  allowed  to  change  anything  which 
affects  the  hierarchical  structure  as  expressed  by  the  Intentions. 

We  will  not  elaborate  on  specifying  a  DBTG  data  base.  It  would  be  a 
real  test  to  our  notation  to  express  the  different  restrictions  and  options 
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of  DBTG.  However,  our  proposed  conceptual  schema  is  network  oriented.  Hence, 
most  of  the  basic  structure  of  DBTG  is  mirrored  automatically. 

Relational  systems  are  not  directly  mirrored  in  our  proposal.  How¬ 
ever,  we  have  argued  in  the  past  that  a  network  framework  will  be  very  appro¬ 
priate  for  relational  implementation  [19] . 

8 .  Concluding  remarks 

This  paper  outlines  only  a  proposal.  Many  details  have  to  be  worked 
out.  Specifically,  we  are  not  insisting  on  the  syntax  of  the  commands.  Only 
the  facilities  are  important  at  this  stage.  We  strongly  believe  that  any  con¬ 
ceptual  schema  should  have  data  pools  and  facilities  for  selection,  connection 
and  their  combination.  In  addition,  we  think  that  the  separation  of  intention 
-  generation  and  definition  -  application  is  important. 

There  was  no  mention  of  a  DML.  Currency  indicators  and  navigation 
can  be  used.  We  tend  to  prefer  many  record  at  a  time  navigation  using  link 
expressions.  We  can  then  peel  off  the  records  from  the  result.  However,  this 
paper  concentrates  more  on  the  data  definition. 

What  we  have  outlined  is  more  conceptual  than  practical.  Many  practical 
issues  like  optimization,  DBA  role,  etc.  are  omitted.  Especially,  we  make  no 
claim  for  compatibility  and  support  of  existing  systems.  It  is  one  thing  to 
support  different  conceptual  organizations.  It  is  quite  different  to  support 
even  one  existing  system  with  all  its  options,  features  and  peculiarities. 

We  have  worked  in  the  past  on  a  prototype  system  (EDBS)  which  supports 
all  three  DBMS  organizations  [13,  15].  However,  our  approach  has  been  rather 
ad  hoc.  We  are  implementing  EDBS  again  trying  to  take  advantage  of  the  common 


mechanisms . 
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